AITopics | two-timescale regime

Leveraging the two-timescale regime to demonstrate convergence of neural networks

Neural Information Processing SystemsFeb-17-2026, 03:59:42 GMT

Artificial neural networks are among the most successful modern machine learning methods, in particular because their non-linear parametrization provides a flexible way to implement feature learning (see, e.g., Goodfellow et al., 2016, chapter 15).

artificial intelligence, machine learning, neuron, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Ohio (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Switzerland (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Leveraging the two-timescale regime to demonstrate convergence of neural networks

Neural Information Processing SystemsDec-26-2025, 19:26:22 GMT

We study the training dynamics of shallow neural networks, in a two-timescale regime in which the stepsizes for the inner layer are much smaller than those for the outer layer. In this regime, we prove convergence of the gradient flow to a global optimum of the non-convex optimization problem in a simple univariate setting. The number of neurons need not be asymptotically large for our result to hold, distinguishing our result from popular recent approaches such as the neural tangent kernel or mean-field regimes. Experimental illustration is provided, showing that the stochastic gradient descent behaves according to our description of the gradient flow and thus converges to a global optimum in the two-timescale regime, but can fail outside of this regime.

demonstrate convergence, neural network, two-timescale regime, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.62)

Add feedback

cd062f8003e38f55dcb93df55b2683d6-Paper-Conference.pdf

Neural Information Processing SystemsOct-10-2025, 23:46:01 GMT

artificial intelligence, machine learning, optimization problem, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Ohio (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Switzerland (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Add feedback

Leveraging the two-timescale regime to demonstrate convergence of neural networks

Neural Information Processing SystemsJan-19-2025, 22:24:23 GMT

We study the training dynamics of shallow neural networks, in a two-timescale regime in which the stepsizes for the inner layer are much smaller than those for the outer layer. In this regime, we prove convergence of the gradient flow to a global optimum of the non-convex optimization problem in a simple univariate setting. The number of neurons need not be asymptotically large for our result to hold, distinguishing our result from popular recent approaches such as the neural tangent kernel or mean-field regimes. Experimental illustration is provided, showing that the stochastic gradient descent behaves according to our description of the gradient flow and thus converges to a global optimum in the two-timescale regime, but can fail outside of this regime.

demonstrate convergence, neural network, two-timescale regime, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.67)

Add feedback

Leveraging the two timescale regime to demonstrate convergence of neural networks

Marion, Pierre, Berthier, Raphaël

arXiv.org Machine LearningOct-25-2023

We study the training dynamics of shallow neural networks, in a two-timescale regime in which the stepsizes for the inner layer are much smaller than those for the outer layer. In this regime, we prove convergence of the gradient flow to a global optimum of the non-convex optimization problem in a simple univariate setting. The number of neurons need not be asymptotically large for our result to hold, distinguishing our result from popular recent approaches such as the neural tangent kernel or mean-field regimes. Experimental illustration is provided, showing that the stochastic gradient descent behaves according to our description of the gradient flow and thus converges to a global optimum in the two-timescale regime, but can fail outside of this regime.

artificial intelligence, machine learning, neuron, (17 more...)

arXiv.org Machine Learning

2304.09576

Country:

North America > United States > Ohio (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Switzerland (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Add feedback

Filters

Collaborating Authors

two-timescale regime

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Leveraging the two-timescale regime to demonstrate convergence of neural networks

Leveraging the two-timescale regime to demonstrate convergence of neural networks

cd062f8003e38f55dcb93df55b2683d6-Paper-Conference.pdf

Leveraging the two-timescale regime to demonstrate convergence of neural networks

Leveraging the two timescale regime to demonstrate convergence of neural networks